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Abstract 


Ensembles of numerical forecasts based on perturbed initial conditions have long been used to improve estimates of 
both weather and climate forecasts. The Goddard Earth Observing System (GEOS) Atmosphere-Ocean General 
Circulation Model, Version 5 (GEOS-5 AOGCM) Seasonal-to-Interannual Forecast System has been used routinely 
by the GMAO since 2008, the current version since 2012. A coupled reanalysis starting in 1980 provides the initial 
conditions for the 9-month experimental forecasts. Once a month, sea surface temperature from a suite of 11 
ensemble forecasts is contributed to the North American Multi-Model Ensemble (NMME) consensus project, which 
compares and distributes seasonal forecasts of ENSO events. Since June 2013, GEOS-5 forecasts of the Arctic sea- 
ice distribution were provided to the Sea-Ice Outlook project. The seasonal forecast output data includes surface 
fields, atmospheric and ocean fields, as well as sea ice thickness and area, and soil moisture variables. The current 
paper aims to document the characteristics of the GEOS-5 seasonal forecast system and to highlight forecast biases 
and skills of selected variables (sea surface temperature, air temperature at 2 m, precipitation and sea ice extent) to 
be used as a benchmark for the future GMAO seasonal forecast systems and to facilitate comparison with other 
global seasonal forecast systems. 
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1 Introduction 


Deterministic numerical weather prediction forecasts have a forecasting window that is limited to about 15 days 
(e.g., Lorenz, 1963, 1993). As noted by Palmer and Anderson (1993, 1994) and others, useful predictability 1s 
possible beyond this limit in part because boundary forcing such as sea surface temperatures or soil moisture (Koster 


and Suarez, 2001) may vary slowly and reliably, and may then influence statistics of the atmosphere. In 2010, the 
US National Academies reported on the state of seasonal-to-interannual predictability, and suggested avenues for 
progress (Weller et al., 2010). Among the recommendations was the need to establish and evaluate a multi-model 
ensemble, which was recognized as a viable strategy for resolving forecast uncertainty (e.g., Kirtman et al., 2014). 
The NASA Global Modeling and Assimilation Office (GMAO) has participated in the North American Multimodel 
Ensemble (NMME; Kirtman et al., 2014) since its inception. The purpose of the NMME 1s to advance the 
capabilities of the climate prediction models, and utilize the system in a near-operational mode to demonstrate 
feasibility. The GMAO system 1s based on its use and experience with data assimilation methods that have been 
developed for mission support and to enhance NASA’s program of earth observations. The development and use of 
the seasonal forecasting system enhances the use of NASA data and contributes to observing system science by 
improving assimilation systems and atmosphere and ocean modeling tools. Evaluation of the GMAO system has 
previously been conducted with a focus on the predictability of the El Nifo/Southern Oscillation phenomenon 
(ENSO; Ham et al., 2014; Vernieres et al., 2012). In this paper, we provide a more comprehensive assessment of 
current forecasting system as it reaches the end of its life cycle. As the system has now progressed through several 
years within the NMME near-operational mode, this paper critically examines recent performance. 


The layout of the paper is as follows. Section 2 provides an overview of the GMAO Goddard Earth Observing 
System (GEOS) Atmosphere-Ocean General Circulation Model, version 5 (GEOS-5 AOGCM) Seasonal Forecast 
System. Section 3 details the initialization procedure for each system component, and the means of ensemble 
generation through field perturbations and sampling in time. Section 4 presents an assessment of the forecast sea 
surface temperature (SST). Section 5 presents the bias and skills of the relevant atmospheric fields, including 2-m air 
temperature (T2M) and precipitation. Section 6 examines the prognostic sea ice cover. Conclusions are presented in 
section 7. 


2 Overview of the GEOS-5 Seasonal Forecast System: Model 
Components 


The GEOS-5 AOGCM has been developed to simulate climate variability on a wide range of time scales, from 
synoptic time scales to multi-century climate change, and has been tested in coupled simulations and in data 
assimilation mode. The ocean and atmosphere exchange fluxes of momentum, heat and freshwater through a “skin 
layer” interface which includes parameterization of the diurnal cycle and a sea ice model. All components are 
coupled together using the Earth System Modeling Framework (ESMF, Hill et al. 2004). The goal in having a multi- 
scale modeling system with its different components communicating through a unified interface (ESMF) is to be 
able to propagate improvements made to a physical process in one component to the other components smoothly and 
efficiently. The GEOS-5 AOGCM was configured to participate in the Coupled Model Intercomparison Project 
phase 5 (CMIP5), which provides a standard protocol for evaluation of coupled GCMs. To evaluate the model's 
ability to simulate the Earth's climate, it was validated against observational data and reanalysis products. 


2.1 Atmospheric Component 


The atmospheric component of the GEOS-5 AOGCM is Fortuna-2.5, the same that was used for the Modern-Era 
Retrospective Analysis for Research and Applications (MERRA; Rienecker et al. 2011), but with adjusted 
parameterization of moist processes and turbulence (Molod et al. 2012). The model has a finite volume dynamical 
core (Lin 2004), which is integrated with various physical packages through the ESMF. 

The physics package includes parameterization of moist processes, radiation, turbulent mixing and surface fluxes. 
The moist component contains parameterization of convection using the Relaxed Arakawa-Schubert scheme 
(Moorthi and Suarez 1992), and the large-scale precipitation and cloud cover model as described in Bacmeister et al. 


(2006). The radiation component includes parameterization for long wave (Chou 1990, 1992) and short wave 
radiation processes (Chou et al 1994). The turbulence component consists of parameterization for vertical 
diffusivity, the planetary boundary layer and gravity wave drag. The free atmospheric turbulent diffusivities are 
based on the gradient Richardson number. The parameterization of the boundary layer is based on Lock et al. (2000) 
scheme, acting together with scheme of Louis and Geleyn (1982). The Lock et al. (2000) scheme includes a 
representation of non-local mixing (driven by both surface fluxes and cloud-top processes) in unstable layers, either 
coupled to or decoupled from the surface, and an explicit entrainment parameterization. The original scheme was 
extended in GEOS-5 to include moist heating and entrainment in the unstable surface parcel calculations. GEOS-5 
incorporates two gravity wave drag parameterizations, an orographic gravity wave drag formulation based on 
McFarlane (1987), and a formulation for non-orographic waves based on Garcia and Boville (1994). The surface 
exchange of heat, moisture and momentum between the atmosphere and land, ocean or sea ice surfaces are treated 
with a bulk exchange formulation based on Monin-Obukhov similarity theory. 

The atmospheric model uses a Cartesian grid with a 1° x 1.25° horizontal resolution and 72 hybrid vertical levels 
with the upper most level at 0.01 hPa. 


2.2 Ocean Component 


The ocean component of the GEOS-5 AOGCM is the Modular Ocean Model version 4 (MOM4) developed at 
Geophysical Fluid Dynamics Laboratory (Griffies 2012). It 1s a non-Boussinesq, hydrostatic, primitive equations 
model with a staggered Arakawa B-grid or C-grid and generalized level (vertical) coordinate based on depth or 
pressure. A tripolar grid is used to resolve the Arctic Ocean without polar filtering (Murray, 1996). The nominal 
resolution of the ocean grid 1s '2°, with a meridional equatorial refinement to ‘4°. It is a regular Cartesian grid south 
of 65°N, and curvilinear north of 65°N, with two poles located on land to eliminate the problem of vanishing cell 
area at the geographic North Pole. The resulting tripolar grid has a minimum and maximum resolution of 15 km and 
52 km, respectively.. The ocean topography is derived from the ETOPOS data set (Smith and Sandwell 1997).The 
topography is represented as a partial bottom step to better represent topographically influenced advective and wave 
processes. Vertical mixing follows non-local K-profile parameterization of Large et al. (1994) and includes 
parameterizations of tidal mixing on continental shelves (Lee et al. 2006) as well as breaking internal gravity waves 
(Simmons et al. 2004). Mesoscale eddy transport uses the method developed by Ferrari et al. (2010), modifying the 
isoneutral method developed by Gent and McWilliams (1990). The restratification effect of submesoscale eddies 
uses the theory developed by Fox-Kemper et al. (2008) and implementation by Fox-Kemper et al. (2011).The 
horizontal viscosity uses the anisotropic scheme of Large et al. (2001) for better representation of equatorial 
currents. The exchange with marginal sea is parameterized under coarse resolution as discussed in Griffies et al 
(2004). 


2.3 Sealce Component 


The sea ice component of the GEOS-5 AOGCM is the Community Ice CodE, version 4 (CICE; Bailey et al. 
2010,Hunke 2008) developed at Los Alamos National Laboratory. The model includes several interacting 
components to allow for semi-implicit coupling between the atmosphere and ice surface: a thermodynamic model 
that computes local growth rates of snow and ice due to vertical conductive, radiative and turbulent fluxes, along 
with snowfall; a model of ice dynamics, which predicts the velocity field of the ice pack based on a model of the 
material strength of the ice; a transport model that describes advection of the area concentration, ice volumes and 
other state variables; and a ridging parameterization that transfers ice among thickness categories based on energetic 
balances and rates of strain. A skin layer interface 1s used for the exchange of basal heat, salt, and freshwater fluxes 
with the underlying MOM4 ocean model; ice pressure is not exerted on the ocean. The CICE model is configured 
with standard settings but without the use of melt ponds. 


2.4 Land Component 

The land surface model in the GEOS-5 AOGCM is a catchment-based hydrological model described in Koster et al. 
(2000). In this model, subgrid heterogeneity in surface moisture state is treated statistically. The applied subgrid 
scale distributions are related to the topography, which exerts a major control over much of the subgrid variability. 
The catchment model is coupled to the multi-layer snow model described in Stieglitz et al. (2001). 
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3 Overview of the GEOS-5 Seasonal Forecast System: Initial State 
Generation 


3.1 Atmosphere Initialization 


In the coupled model initialization, selected atmospheric variables are constrained with the Modern-Era 
Retrospective Analysis for Research and Application (MERRA; Rienecker et al. 2011). These variables include 
surface pressure, pressure thickness, zonal and meridional winds, specific humidity, ozone concentration, and 
potential temperature. 


3.2 Ocean and Sea-Ice Initialization 


The Goddard Earth Observing System integrated Ocean Data Assimilation System (GEOS-1ODAS) 1s used for both 
ocean state and sea ice initialization for the production of analysis products (MERRA-Ocean). The ocean and sea-ice 
initialization methodology is described in detail in Vernieres et. al. (2012) and Keppenne et al. (2008). An overview 
of the initialization procedure relevant to the hindcasts 1s presented here. 
The assimilated observing system consists of: 
* sea surface temperature observations from CMIP5 (Hurrell et al. 2008) prior to 1982 and Reynolds 
(Reynolds et al. 2007) from 1982 to present; 
¢ temperature and salinity profiles from eXpendable Bathythermographs (XBTs) and Conductivity 
Temperature Depth (CTD) sensors extracted from the EN3 data base (Ingleby and Huddleston 2007) with 
time-varying XBT corrections applied according to Levitus et al. (2009), the tropical moored buoy array 
(McPhaden et al. 2010) - TAO/TRITON, PIRATA, and RAMA arrays and Argo floats, with profiles from 
the Argo Global Data Assembly Center (GDAC); 
* sea ice concentration from the National Snow and Ice Data Center (NSIDC). 
The NSIDC sea-ice concentrations product is based on passive microwave observations of ice concentration from 
the Nimbus-7 Scanning Multi-channel Microwave Radiometer (SMMR) and the Defense Meteorological Satellite 
Program (DMSP) Special Scanning Microwave Imager (SSM/I) and Special Scanning Microwave Imager/Sounder 
(SSMIS). It has a 25 km spatial resolution for both the north and south polar regions. Temporal resolution is every 
other day from October 1978 to July 1987 (SMMR), then daily from August 1987 to present (SSM/I, SSMIS). Ice 
concentrations from CMIP5 and Reynolds are used in areas that are not measured due to orbit inclination (poleward 
of 87.2° for SSM/I and 84.5° for SMMR). 
The above observations are assimilated using an ensemble optimal interpolation technique (Oke et al. 2010, Wan et 
al. 2010) with 5-day window from 1979 to present. The model is also weakly constrained to the World Ocean Atlas 
2009 (WOAO9) gridded climatology (Antonov et al. 2010; Locarnini et al. 2010) of T(z) and S(z) at 1° degree 
resolution and from 0 to 4500m and of Sea surface salinity (SSS) to correct some of the model's biases, particularly 
prior to the Argo era. 
The resulting analysis (MERRA-Ocean) has been extensively diagnosed through The Ocean Reanalyses 
Intercomparison Project (Balmaseda et al. 2015) in terms of various parameters such as mixed-layer depth, 
thermocline depth, heat and salinity content, overturning circulation, etc. 


3.3. Land 


An important aspect of the GEOS-5 initialization concerns the treatment of the land. Observed precipitation data are 
used to construct a corrected version of the hourly MERRA (or GEOS-S5 forward processing) precipitation fields, 
which are then used to force the land surface and generate enhanced soil moisture initial conditions for initializing 
the GEOS-5 seasonal forecasts. The corrections to the precipitation are obtained using the Global Precipitation 
Climatology Project version 2.1(GPCPv2.1, provided by the NASA/Goddard Space Flight Center's Laboratory for 
Atmospheres, which develops and computes the dataset as a contribution to the GEWEX Global Precipitation 
Climatology Project) and Climate Prediction Center (CPC) Merged Analysis of Precipitation (CMAP, provided by 
the NOAA/OAR/ESRL PSD, Boulder, Colorado, USA, from their web site at http://www.esrl.noaa.gov/psd/) pentad 
precipitation data following Reichle et al. (2011). As the first step, the CMAP dataset is rescaled to match the 
(seasonally variable) long-term climatology of the GPCP. During the second step, hourly MERRA total precipitation 
is time averaged and re-gridded to the scale of the correcting CMAP dataset (1.e., to pentad and 2.5° resolution). 
Next, separately for each pentad of each year and for each 2.5° grid cell, a scaling factor is computed by determining 
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the ratio of the (climatologically adjusted) CMAP estimate to the MERRA data (1.e., on the grid and at the time scale 
of the correcting observations). Finally, these scaling factors are re-gridded back to the MERRA grid and a scaling 
factor derived for a given grid cell and year/pentad is applied to the MERRA precipitation rates (large-scale 
precipitation, convective precipitation, and snowfall separately) in each of the 120 hourly time steps within that 
pentad. By construction, the corrected MERRA precipitation is nearly identical to the CMAP estimates at the pentad 
and 2.5° resolution. The diurnal cycle, the frequency and relative intensity of rainfall events at the sub-pentad scale, 
and the sub-2.5° spatial variations are entirely based on MERRA estimates. 


3.4 Sampling in Time 

Each month, the GMAO produces an ensemble of 12 (13 in November) real-time GEOS-5 coupled model forecasts. 
The ensemble is produced by initializing the model every five days (table 1) prior to the start of the month, except 
for the date closest to the start of the month when additional GEOS-5 forecasts are generated by various 
perturbation methods (tables 2 and 3). The perturbations are produced using a breeding approach (perturbing the 
atmosphere and/or ocean), and a simple scaled differencing approach involving nearby (in time) atmosphere and 
ocean states. We note that due to time constraints only 11 ensemble members are delivered to the (North American 
Multi-Model Ensemble (NMME) project (forecasts are due by the 8th of the month). 

In addition to the forecasts, the GMAO produced a suite of hindcasts (1982-2012) used to calibrate/bias correct the 
forecasts and assess forecast skill. The ensemble members for the hindcasts are produced in the same way as for the 
forecasts and span the period 1982-2012. 

Table 1 Seasonal Forecast Schedule 
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Table 1 shows the start dates for the ensemble members of the GMAO Seasonal forecasts and hindcasts. The bold 
shaded values denote the closest dates to the start of the month for which additional ensemble members are 


generated using various perturbation methods. 


3.5 Perturbations 


Ensembles of numerical forecasts based on perturbed initial conditions have long been used to improve estimates of 
both weather and climate forecasts. The GEOS-5 seasonal forecast is arranged so that it uses a set of 7 members for 
the forecast initialized on the day closest to the beginning of the month, and one member otherwise for a combined 
ensemble of 11 members. Initial perturbation method, that adds small perturbation to analysis initial conditions, is 
used to generate the ensemble forecast. Since one goal has been the use of the ensemble spread as an indicator of 
expected forecast skill, bred vectors (Toth and Kalnay, 1993) have been used as perturbations to capture the fastest 
growing modes on weather time scales. More recently, the coupled breeding method was developed for coupled 
atmosphere ocean systems to capture the dominant mode of coupled instabilities associated with the El 
Nino/Southern Oscillation (ENSO) (Cai et al. 2003; Yang et al. 2006, 2008; Ham et al. 2012). 

The breeding is applied from 1980 with the aim of capturing the fastest-growing errors in the seasonal forecasts. 
Two-sided breeding is applied, which means positive and negative bred runs are restarted every month by adding 


216 and subtracting the bred vector to the initial conditions generated from the Ensemble Optimal Interpolation (EnOT) 
217 option of the GEOS ocean data assimilation system, forced with NASA’s Modern-Era Retrospective analysis for 
218 Research and Applications (MERRA) (Rienecker et al. 2011). 

219 The rescaling interval chosen for the breeding is 30 days. The rescaling norm is the RMS difference of the 
220 instantaneous sea surface temperatures (SSTs) from the positive and negative bred runs; the region for defining the 
221 norm is the tropical Pacific domain over 120°E-90°W, and 10°S-10°N. At every re-initialization during the breeding 
222 cycle, perturbations are re-scaled so that the magnitude of the norm is reduced to 10% of the natural variability of 
223 SST over the norm region (i.e. 0.48°C). Another method to perturb initial conditions is based on the GEOS-5 
224 analysis on two different days. Similar to breeding, the perturbations are re-scaled and the magnitude of the norm is 
225 reduced to 10% of the natural variability of SST over the norm region (i.e. 0.48°C). A combination of these two 
226 methods is used in generating the ensemble members for the seasonal forecast. 

22/7 Tables 2, 3 and 4 illustrate the perturbations of the initial conditions for all ensemble members generated at 


228 the beginning of the month that are submitted to NNUME. Additional ensemble members utilize satellite 
229 altimetry data, which do not cover the full NMME hindcast period. Table 2 Ensemble members’ perturbation 
230 combinations 


Initial conditions (IC) perturbations 
Ensemble 3 4 
member 


IC type 


(Ocean |O_  OBe fF OBH OOF OH LOCO 


231 Table 3 Format of IC perturbations 


Perturbation type 
ocean and atmosphere IC are not perturbed 
ocean and atmosphere IC are perturbed using negative bred vectors 
ocean and atmosphere IC are perturbed using positive bred vectors 
ocean and atmosphere IC are perturbed using negative rescaled difference between two 
analyses 
ocean and atmosphere IC are perturbed using positive rescaled difference between two 
analyses 
atmosphere IC are perturbed using negative rescaled difference between two analyses 
ocean IC are perturbed using positive bred vectors 
232 Table 4 List of perturbed variables 


ocean model grid 


sea level and frazil 
ice velocity and strain rate components 


atmosphere model grid 


ice strength, extent and stress tensor components 


233. 4 Forecast Skill: SST 
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The forecast accuracy of the coupled model forecasts is assessed by the amplitude and phase of SST anomaly 
measured for specified regions and by the global patterns of SST. The forecast accuracy of the atmosphere-land 
forecasts will be assessed by the patterns and amplitude of the precipitation and surface temperature anomalies. For 
the arctic sea ice forecasts evaluation, the sea ice extent is compared with observations and against other similar 
systems participating in the Ice Outlook project. 

Fig1 depicts the regions that are used to compute the SST indices routinely used to assess the forecast skills. Other 
regions of interest, used for the case studies of the 2m air temperature and precipitation, are also shown. 


20° 40° 60° 60° 100° 120° 140° 160° 780° 160° 140° 120° 100° 60° 60° 40° 20° Qo" 20° 


Nino 4 


Nino 3.4 


Fig 1Regions used in SST forecast skill assessment are shown by the colored, blue and green, annotated rectangles, Nifo 3.4 
region is hatched. The three regions shown by pink rectangles are used in the 2m air temperature and precipitation evaluation 


In the equatorial Pacific Ocean the easternmost Nino 1+2 region boundaries are0°-10°S, 90°W-80°W, the eastern 
Nino 3 region boundaries are 5°N-5°S, 150° W-90°W, the central Nino 3.4 region boundaries are 5°N-5°S, 170° W- 
120°W and the western Nino 4 region boundaries are 5°N-5°S, 160°E-150°W. In the Indian Ocean the Western 
Tropical Indian Ocean (WTIO) SST anomaly index is calculated in the box 50°E - 70°E, 10°S - 10°N, the 
Southeastern Tropical Indian Ocean (SETIO) SST anomaly index is calculated in the box 90°E - 110°E, 10°S - 0°; 
the Dipole Mode Index (DMI) is calculated as the difference of the WTIO and SETIO indices(Saji et al. 1999). The 
Tropical Atlantic SST Index (TASI) is defined as the differences between the North Atlantic Tropical (NAT) and the 
South Atlantic Tropical (SAT) SST indices, computed in the boxes 40°W - 20°W, 5°N - 20°N and15°W - 5°E, 20°S 
- 5°S respectively (Chang et al. 1997) 


4.1 Forecast Drift 


Forecast drift 1s an artifact of the imperfect models. For the seasonal forecast it is necessary to properly account for 
the drift and calibrate the forecast accordingly. A continuous coupled analysis and a complete set of retrospective 
forecasts for the entire training period are required to consistently de-trend the forecast. In GEOS-5 system the drift 
is calculated as the average of these hindcasts from 1981-2010 for every ensemble member. It is subsequently 
subtracted from the production forecasts. This method of drift removal follows the convention established by 
Stockdale (1997) and others. The forecast bias characteristics are also important to understand for evaluating the 
performance of the current and the future seasonal-to-interannual forecast systems. Comparison of the retrospective 
forecasts to the observations is helpful in determining the model’s skill. 


4.1.1 Global Bias 


Fig 2 shows the global forecast drift from Reynolds SST Climatology for December. Nine panels (top to bottom, left 
to right) correspond to initial conditions 1 month prior to December, 2 months prior and so on, the last panel shows 
the forecast for December initialized in April. This 1s the average drift of all the ensemble members. Immediately 
one can see from the first panel the cold bias appearing during the first month of the forecast in the northwestern 
Atlantic ocean where subpolar surface water displaces the warm, salty water of the North Atlantic Current(Large 
and Danabasoglu 2006) and off the east coast of South America at the confluence of the Brazil current and the 
Antarctic Circumpolar Current exiting Drake Passage. Just as quickly the warm biases develop in the coastal areas 
off the west coasts of South America and southern Africa and off the east coasts of Asia and North America. The 
biases described above are present in all the forecast regardless of the initialization time. The coupled model exhibits 
a cooling trend in the equatorial Pacific Ocean and in the Southern Ocean, but it takes about 4 to 5 months for the 
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large scale SST biases in these regions to reach their maximum value of about 3°C (see panels in the middle row of 
Fig 2, the December forecasts initialized in August and July). There is a cold bias in the southeastern tropical part of 
every of the three major ocean basins (Indian, Pacific, Atlantic) developing over the same time period. 
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Fig 2Monthly mean SST forecast drift with respect to Reynolds climatology for December for every forecast lead time, i.e. 


60E 120E 


December forecast initialized at the beginning of December (lead 1), at the beginning of November (lead 2), up to the 
December forecast initialized in April (lead 9) 


Fig 3 shows the model drift for every forecast for the last (lead 9) month, when the amplitude of the drift 1s expected 
to be the largest. The top left panel on this figure would contain the lead 9 drift of the December forecast (initialized 
in April, shown on the bottom-right panel of the Fig 2, thus omitted here). The order of panels is schematically listed 
in its place: predicted month first and next to it in parenthesis the initialization month. Thus the top row shows the 
model bias for winter (initialized in Apr-Jun), the second row shows the bias for spring (initialized in Jul-Sep), etc. 
The model climatology is colder than Reynolds along the equator in the Pacific Ocean during all seasons, but 
especially so in the fall and winter (initialized in Jan-Mar). The southeastern tropical Pacific and southeastern 
tropical Atlantic cold biases are also present throughout the year, but more pronounced in the boreal winter (Dec- 
Mar) season (initialized in Mar-Jul). In the Indian Ocean, the cold bias in the tropical southeastern part and along 
the equator and concurrent warm bias off the western Australia coast is present only during Dec-Mar (initialized in 
Apr-Jul). 

The SST bias in the northern Pacific Ocean has the strongest seasonality: there is a dipole structure with warm mid 
latitudes and cold tropics in Jul-Sep (initialized in Nov-Jan) with the differences between the model and observed 
climatologies as large as +3°C and -2°C. At the same time a similar pattern of warm mid latitudes/cold tropics bias 
appears in the Atlantic ocean, but the magnitude of the bias is smaller, with warm bias about 1°C and cold about 
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Fig 3Lead 9 monthly mean Forecast SST forecast bias with respect to Reynolds climatology for Jan-Nov predicted months; the 


60E 120E 180E 


month when the forecast was initialized is shown in parenthesis; lead 9 shown (the order of panels is shown in the place of 
Dec(Apr) predicted(initialized) month) 


4.1.2 ENSO,IDM and TASI SST indices 


Fig 4 shows the mean ensemble bias for the Pacific Ocean Nino 3, Nino 4, Nifio 3.4 and Nino 1+2 indices, WTIO, 
SETIO, IDM and TASI computed with respect to Reynolds SST. In the western and central equatorial Pacific Ocean 
(Nifio 4, Nifio 3.4) the model has an exaggerated seasonal cycle with a cold bias of up to 2°C in the boreal fall and 
winter for all forecasts targeting this time period. The fastest drift away from the observations appears in the 
forecasts initialized in summer (July, August). The forecasts initialized in winter and early spring (Jan-Mar) tend to 
stay close to the observations until the onset of summer. In the easternmost Pacific Ocean (Nino 1+2 region) the 
model is biased warm up to 2°C throughout the year with the exception of winter target months, when all forecasts 
return close to the observations. The Nino 3 region has the smallest bias and the most accurate seasonal cycle 
represented by the model. In the Indian Ocean the model in general 1s less biased: the warm bias 1s slightly larger in 
the east than in the west, thus the IDM (Indian Ocean dipole mode index) is slightly biased towards negative values 
during the summer and fall. The Tropical Atlantic SST Index (TASI)— the difference between the northern and the 


312 southern Atlantic ocean control regions (refer to Fig 1 for their definition) — is negatively biased by approximately 
313 1°C throughout the year, underestimating the absolute value of the gradient between the north and the south index 
314 poles in the summer and fall, and overestimating it in the spring. 
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315 
316 Fig 4Monthly mean SST forecast drift with respect to Reynolds for equatorial Pacific, Indian and Atlantic Ocean indices 


317. 4.2 Forecast Skill 


318 Similarly to the forecast drift discussion, the SST global skill maps are presented first, followed by the analysis of 
319 the regional indices. Anomaly Correlation Coefficient (ACC) is used as a measure of potential skill and Mean 
320 Square Skill Score (MSSS) as a measure of actual skill. MSSS is computed with respect to climatology, 1.e. zero 
321 anomaly case, as follows, 
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Here Ty,<;(i) is the temperature anomaly of the i-th hindcast and T,;,,(@) = 0. 


4.2.1 Global Anomaly Correlation Skill 


Fig 5 shows the global SST ACC computed for all forecast from all initializations months combined, with each 
panel representing the lead months. Lead month | has high correlation (above 0.8) with Reynolds SST in all the 
ocean basins. By lead months 2 and 3, the high correlation remains only in the tropical Pacific and Atlantic oceans. 
The Atlantic Ocean skill drops below 0.6 by lead month 6, but still remains high in the north Atlantic Iceland Basin 
region. Only in a portion of the Equatorial Pacific (Nino 3.4 region) ACC remains above 0.6 by lead month 9. From 
the significance point of view, the skill across equatorial regions in all oceans and in the north Atlantic Iceland Basin 
region remains viable until the final months of the forecast. 
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Fig 5Global monthly SST ACC for all forecast initial months combined together; Reynolds monthly SST is used as observations; 
nine lead months are shown top-to-bottom, left-to-right. Pearson correlation significance test is applied with p_value at 0.01 
(Pearson 1896) 


4.2.2 Oceanic Indices Skill 


Figure 6 and 7 show the Hovmdller diagrams of SST ACC and MSSS vs. the forecast initialization month for 
various SST indices. These illustrate the seasonal variability of the forecast skill. The drift period computed for the 
anomalies is 1993-2010 and the period used for the skill computation is 1993-2014. This period overlaps with the 
ensemble members that use the altimeter data. For the ACC, a Pearson correlation significance test with p-value at 
0.01 is applied. 
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Top left panels in figure 6 show ACC of the western (Nino 4) the eastern (Nino 3) equatorial Pacific indices. 
Overall, albeit significant, the ACC is lower in the Nifo 4 region for most of the spring and especially summer start 
months. In the Nifo 3 region the ACC 1s very high (+0.8) for Jun-Sep start months, which coincides with the 
amplitude growth phase of El Nifo/La Nina. In the Nifio 4 region the highest ACC is attained during the late fall and 
winter forecasts, 1.e. when an El Nifio/La Nina is at its peak and begins to wind down. In the central Pacific Ocean 
(Nino 3.4 region, top row second from the right panel in Figure 6), the anomaly correlation skill is robustly high 
(>0.7) for February to September start months throughout the 9 month forecast. ACC drops sharply beyond May in 
Nifio 3 and Nino 3.4 regions for most forecast started in September-January, which is an indication of the spring 
predictability barrier. In the Nifo 4 region the spring barrier is not as pronounced, with the significant anomaly 
correlation skill retained through June for all forecasts initialized in late autumn and winter months (Oct-Feb). The 
relatively abrupt drop in ACC for forecasts starting in Jun-Aug maybe related to the rapid model drift during the first 
lead months for these forecasts. In the Nifio 1+2 region (top right most panel in Figure 6), ACC spring predictability 
barrier occurs earlier than in the equatorial regions, in March, with the skill dropping below significance level after 
two months in January and December, and 3 months in February forecasts. The best seasons in forecasting this area 
are late spring and early summer. 


SST forecast skill in the Indian Ocean is characterized by the presence of its own predictability barrier. This drop in 
the prediction skill occurs at the onset of the boreal summer monsoon and is found at both IDM poles (Waisowicz 
2007). ACC skill is high beyond the first month only for the WTIO forecasts initialized in Jan-Feb and SETIO 
forecasts initialized in Jul-Aug. In the SETIO ACC there is a second predictability barrier is December, but the skill 
apparently returns later for the forecasts starting in Jul-Sep (left bottom panels in Figure 6).The IDM index defined 
as the difference between WTIO and SETIO indices. Forecasting the relative variability of the two regions in each 
of which the skill is not very robust proves to be a difficult task, although some significant skill beyond the first 
month is observed in the forecasts initialized in May-Nov. Here the December predictability barrier hinted at by the 
SETIO ACC values 1s strongly evident. 

The TASI SST anomaly index is an indicator of the meridional surface temperature gradient in the tropical Atlantic 
Ocean. It was defined by Chang et al. (1997), where it was associated with a potential decadal 'dipole' mode of 
coupled variability in the tropical Atlantic. The GEOS-5 ability to predict the TASI values is robust but short lived: 
for all initialization months except Aug-Oct (and Feb-Mar), the ACC drops below the significance level after 2-3 
months (bottom right panel in Figure 6).The higher skill for these forecasts may be associated with the strength of 
the TASI signal: the amplitude of the index is peaking during these phases of the seasonal cycle (see the bottom 
right panel of Fig 4). 
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Fig 6 SST ACC for all oceanic indices. The Pearson correlation significance test with p-value at 0.01 is applied. Forecast start 
months are along the y-axis and lead months are along the x-axis 
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Fig 7 SST MSSS for all oceanic indices. Forecast start months are along the y-axis and forecast lead months are along the x-axis 


MSSS is a characteristic of how well the anomaly amplitude is forecasted. Even when correlation skill is high, the 
systematic over/under-prediction of the anomaly would lead to a lower MSSS. In equatorial Pacific Ocean indices 
MSSS becomes negative across the spring barrier. In the Nifo 4 region this can be related to an overextension of 
the warm pool to the west, and thus a consistent overestimation of the warm SST anomalies. Fig 1 of the Online 
Resource, showing the historic performance of the GEOS-5 Nifio 4 index, illustrates this point: the El Nifo 
amplitude was overestimated in 1982/83, 1991/92, 1997/98, 2002/03, 2006/07 and 2015/16 cases. Additionally, and 
this 1s evident in all three equatorial indices, the system tends to falsely predict a warming trend(as opposed to the 
neutral condition in reality) for the following spring/summer for forecasts starting in boreal winter. This contributes 
to the drop in anomaly correlation skill (spring barrier) and the low amplitude skull. 

For the Indian Ocean, while ACC/MSSS skills for the Western and Southeastern indices appear to be 
significant/positive for most of the forecasts, both skills for the IDM index are low except for the May-November 
starts, and even for these, the predictable lead time is 2-5 months. This is comparable to other dynamic models (Shi 
et al 2012). The Tropical Atlantic Ocean index skill shows forecast outperforming climatology in terms of error 
absolute value, as well as anomaly correlation, for the short term predictions (2-4 months). 


4.2.3. Case Study: Major El Nino Event of 15/16 

Figure8shows spaghetti plots of the ensemble mean forecasts for each start month for 2015-2016 (in color). 
Observations from Reynolds SST are shown by a solid black line, ocean analysis is shown by a dashed black line. 
The color scale represents the ratio between the forecast absolute departure from the observations and the standard 
deviation of the ensemble at that particular lead time. High values of this measure may be indicative of ensemble 
underdispersion. The 2015/2016 El Nifo was considered a Central Pacific event so of all the indices, the Nifo 4 
index in the Western Pacific exhibited the smallest observed anomaly compared to other Pacific Ocean indices and 
Nino 3.4 had the highest observed anomaly.GEOS-5 overpredicted the magnitude of the SST anomaly at the peak of 
the El Nifio in the western central Pacific (Nifio 4 index) by as much as 1.5°C.The timing was also missed by 
summer and fall (Jun-Nov, 2015) forecasts, they all showed the maximum in January 2016, while it occurred in Nov 
2015. So great was the forecasts departure from the observations, that the latter barely fit within the ensemble 
envelope in October, 2015 through January, 2016, the ensemble mean being as far from the observations as 4 
standard deviations of the ensemble. The maximum of the cooling phase was also overpredicted by more than 1°C, 
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and the timing was too early: the winter and spring (Jan-Jun, 2016) forecasts showed the lowest temperature in 
August 2016, while in reality, the cooling gradually took place over the course of 2016. 

GEOS-5 accurately predicted surface warming in the Nino 3 region as early as March 2015.The following forecasts, 
starting in boreal summer (Jun-Aug, 2015), showed the warming being too early by about 2 months, however the 
timing of the peak SST anomaly in November 2015 - January 2016 was predicted well by all spring and summer 
forecasts except August 2015, which showed the peak in February 2016.This was also the warmest of all the 
predictions; the rest of the forecasts were within 0.5°C of observations, which corresponds to roughly one standard 
deviation of the ensemble. The amplitude of the cooling following the El Nino peak in this region was overpredicted 
by the forecasts initialized in April and May 2016 (they called for a moderate La Nifia), while the rest of the 
forecasts, earlier (Jan-Mar, 2016) and later (June, 2016 onwards) predicted neutral conditions. 
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Fig 8Nifo 4, Nifio 3, Niflo 3.4 and Nifo 1+2 monthly mean SST forecasts; the solid color lines show the ensemble mean, the 
black line is observations (Reynolds SST), the dashed black line is the ocean analysis from which the initial conditions for the 
forecasts were generated. The color scale represents the ratio between the forecast absolute departure from the observations 
and the standard deviation of the ensemble at that particular lead time 


Note the peak of the 2015/2016 El Nifio event that occurred in the NDJ season for the equatorial indices. The 
GEOS-5 model predicted the correct timing and magnitude of this peak for the Nino 3.4 index starting in February 
2015. This index is the one most widely used for ENSO forecasting, thus intercomparison between various models is 
readily available. GEOS-5 model performs similarly to most other models involved in NMME, although more often 
than not, it tends to have stronger ENSO events than other models. 

The Nino 1+2 region reached its peak anomaly in July 2015. The GMAO model predicted the timing of this event 
starting in March but underestimated the magnitude by 0.5°C. The June forecast was very close to the observations 
in timing and magnitude. In all regions, the forecast of the cooling phase of the ENSO starting inMay,2016 was 
exaggerated for all four indices. By July, 2016, start time, the equatorial indices forecasts picked up the transition to 
the neutral conditions. 

Fig 9shows the evolution of the SST and the next figure (Fig 10) shows the equatorial subsurface temperature during 
the onset of the 2015/2016 El Nifo. The overextension of the warm water anomaly to the west of the date line 
clearly shows the forecast difficulty in the Nifo 4 region. It is consistent with lower skills in this area, as noted in 
Ham et al (2014b).The GEOS-5 system exhibited similar behavior during the previous ENSO events (1997/98 El 


437 Nifio and 1982/83 El Nifio, see Online Resource 1-6 illustrating the historic SST indices values in the GEOS-5 
438 forecasts). 
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440 Fig 9 Evolution of the equatorial Pacific Ocean SST during the onset of the 2015-16 El Nifio. Left panel is the monthly mean 
441 forecast SST from May 2015 initial conditions. Right panel is the concurrent MERRA ocean analysis 
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Fig 10Evolution of the subsurface equatorial Pacific Ocean temperature during the onset of the 2015-16 El Nifio. Left panel is 
the forecast monthly mean T from May 2015 initial conditions. Right panel is the concurrent MERRA-ocean analysis 


5 Forecast Bias and Skill: T2M and Precipitation 


Similarly to the SST, but for temperature at 2 meters (T2M) and precipitation, we first present the global forecast 
bias maps and then discuss the regional skills and case studies of two extreme events. 
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5.1 Global Bias 
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Fig 11An example of 2m air temperature seasonal forecast bias for 1 and 3 months lead times. Winter and summer observed 
and predicted fields are shown in the two top rows. The bottom row shows the differences between the model and the 
observations 
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Fig 12 An example of precipitation seasonal forecast bias for 1 and 3 months lead times. Winter and summer observed and 
predicted field are shown in the two top rows. The bottom row shows the differences between the model and the observations 


The bias shown in Figures 11-121s the systematic departure in predicted and observed climatology during the 30 
year (1982-2011) period. For both T2M and precipitation, MERRA-2 (Bosilovich 2016, Molod 2015) data was used 
as the observational validation reference. The first lead month and the third lead month bias for winter and summer 
are showed. The largest bias for T2M is over the winter Arctic Ocean and it increases with lead time. GEOS-5 
underestimates the sea ice temperature in winter. For the northern hemisphere summer, GEOS-5 tends to 
overestimate T2M over land, especially in Asia and the west coast of North America. Overall the bias for T2M 
remains small over the tropical oceans in all seasons over the full length of the forecast. The bias for precipitation 
varies slightly with lead time for both winter and summer. Larger bias is found in the summer than in the winter 
over land in the northern hemisphere. This is possibly due to the fact that summer precipitation over land is more 
likely to be affected by regional and local factors, thus uncertainty in model parameterization as vegetation cover, 
cloud physics etc. could play larger roles in the precipitation bias. 


5.2 Forecast Skill 
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469 Fig 13Monthly mean 2m air temperature anomaly correlation for seasonal forecast and observations (MERRA-2) 
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Fig 14Monthly mean precipitation anomaly correlation for seasonal forecast and observations (MERRA-2) 


Seasonal skills of T2M (Fig 13) and precipitation (Fig 14) are calculated as anomaly correlations between GEOS-5 
forecast and observations. GEOS-5 performs well for T2M for the first lead month for all seasons, especially over 
the tropical oceans. Although the T2M skills over the tropical oceans remains high even after six months, the skills 
over land diminish quickly after the first lead month. Precipitation skills are generally lower than those of T2M, and 
there is hardly any skill after the first month for the extra tropics. However, over the East Pacific Ocean, where 
ENSO has a dominant influence on precipitation, the anomaly correlation remains high until the sixth lead month. 


5.2.1 Regional Average Skills and Case Studies 

For the discussion in this section we consider special regions of particular socio-economic interest: the Amazon 
basin bounded by 80°W-50°W, 20°S-10°N, the Great Plain between 30°N-50°N, 110°W-100°W and Southern India 
between 10°N-15°N, 77°E-80°E. These areas are shown by pink rectangles in Fig 1 and labeled Amazon, GP and SI 
respectively. 
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Fig 15The anomaly correlation for T2M (left) and precipitation (right) between the forecast and MERRA-2 for Amazon River 
basin (top), Great Plain (middle) and Southern India (bottom) regions. Target month (x-axis) represents the date of the 
forecast, and the lead month (y-axis) represents how long that forecast was in months, i.e. target month May with lead 4 
means May forecast initialized in February 


A closer look (Fig 15, top row) at the broad region encompassing the Amazon River basin reveals a good overall 
skill in terms of anomaly correlation for both T2M and precipitation throughout the seasonal forecast. For the 
initialization months Jan-May, it stays above 0.4 up to lead month 9 and for Jun-Jul it starts above 0.6 for the first 
month and drops below 0.4 only after month 6. 

Seasonal forecasts serve as an important prediction tool for extreme events. At the current stage, there are 
uncertainties and difficulties in seasonal forecasts to accurately capture certain events. However, it is intriguing to 
see how GOES-5 performs in various extreme events. Two examples of such extreme events will be discussed next: 
drought over the Great Plains in 2012 and flood over the southern India coast in 2015.Figure 15 (middle and bottom 
panels)shows the T2M and precipitation anomaly correlation skill in these regions. One can see that there 1s little 
correlation between the observations and the predicted precipitation, yet the strong signal during the extreme events 
may give an opportunity for the forecast to capture its characteristics. 
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Fig 16Case studies of the drought over the Great Plains in 2012 and the flood over the Southern India coast in 2015. One to four 
month lead forecasts of T2M and precipitation are plotted against the observations (MERRA-2). For the Great Palin the target 
month is July 2012, for the South India region the target month is November 2015, The anomalies are standardized using the 
corresponding standard deviation 


In the scatter plots in Figl6, one to four month lead forecasts of T2M and precipitation are plotted against the 
observations for the two events. The anomalies are standardized using the corresponding standard deviation. In the 
case of the Great Plain drought, GEOS-5 underestimates the deficit in both temperature and precipitation for all lead 
months. The underestimations are most obvious in the spring initialized forecasts and gradually decrease closer to 
summer. By May and June, GEOS-5 clearly predicts hotter and drier conditions for that summer, although the 
magnitude of the drought is less than in observations. The Great Plain is a region where the local water cycle is 
sensitive to the land surface representations and therefore is sensitive to land initializations. This characteristic of the 
region 1s also presented in this scatter plot. For the case of the southern India flood, GEOS-5 shows a much larger 
model spread for every initialization month. However, the large model spread highlights the low predictability of the 
seasonal forecast. The southern India flood is highly related to the location and strength of the Indian winter 
monsoon. The winter monsoon predictability therefore highly restrains the performance of the GEOS-5 seasonal 
forecast for floods. 


6 Sea Ice Outlook 


Seasonal forecasting systems focus on the ability of coupled atmosphere/ocean models to predict variability in the 
tropical Pacific Ocean and it’s associated higher latitude teleconnections (Kirtman et al. 2014). For middle and high 
latitudes, predictability derived from local oceanic sources has been thought to be limited (e.g., Barsugli and Battisti 
1998). But Arctic sea-ice cover has a decorrelation time scale of up to 5 months (Blanchard-Wrigglesworth et al. 
2011). Moreover, model experiments have indicated sea-ice predictability on seasonal time scales and longer, with 
indications of signal re-emergence beyond one year (Holland et al. 2010; Tietsche et al. 2014; Guemas et al. 2014). 
The prospect of a predictability reservoir has received considerable interest (Richter-Mengeet al. 2012; Stroeve et al. 
2014; Hamilton and Stroeve 2016). Potentially, seasonal forecasts of sea ice have utility for a variety of human 
endeavors including commerce, mineral exploration, and indigenous activities (Stroeve et al. 2015). Arctic sea-ice 
extent 1s also considered a climate variable. Mechanisms controlling its variability and trend are the subject of 
extensive observational and modeling studies (e.g., Perovich and Richter-Menge 2009; Vaughan et al. 2013). The 
presence of floating ice on the ocean radically alters surface properties; it has immediate influence on the exchange 
of energy and moisture between the ocean and the overlying atmosphere. Model experiments have demonstrated the 
impact of ice cover on regional Arctic climate, including air temperature and precipitation (Deser et al. 2010; 
Alexander et al. 2004), and studies have also suggested an influence on large-scale conditions extending beyond the 
immediate Arctic Basin (Thomas et al. 2014). In 2008, a challenge was formulated for comparing and evaluating 
experimental seasonal predictions of the September Arctic sea-ice extent (ARCUS, 2008), which became known as 
the Sea Ice Outlook. Many of the seasonal forecasting systems involved with NMME have participated, including 
the GMAO. Results have been mixed. Assessments have shown that predictions have reduced skill in years where 
the observed ice cover departs significantly from the long-term trend (Hamilton and Stroeve 2016). Subsequent 
evaluation of models participating in the Sea Ice Outlook has shown that forecasts have difficulty surpassing the 
skill of damped persistence, and difficulty predicting each other (Blanchard-Wrigglesworth et al. 2015). 
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Forecasts submitted to the Sea Ice Outlook were composed of the ensemble members initialized at every five days 
of the prior month, along with ten members initialized at the beginning of the month. Unlike the ENSO forecasts and 
the ensemble members submitted to NMME, the sea ice forecasts made use of three experimental members, which 
were produced in hindcast mode beginning in 1993.These experimental members featured the inclusion of altimeter 
data, which was obtained from the Archiving, Validation and Interpretation of Satellite Oceanographic data project 
(AVISO; http://www.aviso.altimetry.fr/)and used in the ocean assimilation. 


SIPN/ARCUS June Sea Ice Outlooks of September Mean Extent: 2014-2016 
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Fig 17Comparison of forecast September Arctic ice extent error from submitted June Sea Ice Outlook models for the period 
2014-2016. Yellow dashed line indicates the average error over the three years. Error is computed as the difference of the 
forecast value minus the extent from passive microwave data (Cavalieri et al. 1996) 
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Fig 18September 2014 forecast and the observed spatial pattern of Arctic sea ice 


As shown in Fig 17, the GMAO system has performed well in comparison to other models over the period in which 
it has participated in the Sea Ice Outlook. For the previous three Outlooks, the average extent error is 0.32+0.22 
x10° km? for the GMAO system as compared to 0.57+0.4210° km” for the average of all dynamical predictions. 
The uncertainty denotes the standard deviation of the forecast errors.Fig18also indicates that the spatial patterns for 
the September 2014 forecast were similar to the observed pattern. Over the hindcast period of 1998 to 2015, the June 
forecast explains 49 percent of the observed September ice extent variance, which may be considered of marginal 
skill. But this belies several critical issues with the forecast system, which are largely associated with initial 
conditions. As previously noted, the MERRA atmospheric reanalysis is used in the ocean assimilation. Cullather and 
Bosilovich (2012) found near-surface air temperatures in MERRA are as much as 10°C too warm in the late Arctic 
spring, owing to an erroneously low, fixed sea-ice albedo used in the uncoupled atmospheric reanalysis. The surface 
temperature bias and its effects on the GEOS-1O0DAS oceanic temperatures led to an anomalous reduction in 
forecast ice cover initialized during early summer months.Fig20indicates the increase in forecast error in summer 
months, such that forecast skill actually becomes reduced with decreasing lead time. This inhibits contributions to 
the Sea Ice Outlook for one- and two-month lead times to the September Arctic ice minimum. The June Outlook 
contributions are based on the prior month’s initialization. Analysis of hindcasts also finds low ice cover for spring 
forecasts initialized during the first ODAS data stream covering the period until 1993, as shown in Fig 19.Low ice 
volume associated with this stream and the interaction with the erroneously warm atmospheric forcing in the 
analysis results in poor ice forecasts over the time period. Hindcast skill improves in the later GEOS-1ODAS stream 
and with the introduction of altimetry-based forecast ensemble members after 1993; as seen in Fig19, variability in 
the ensemble mean forecast is comparable to observed values after 1998. The period from 1998 to the present is 
used for a simple bias correction in the forecast to account for differences with the NSIDC Sea Ice Outlook. This is a 
common practice among Sea Ice Outlook participants. Over the full period, forecast ice cover for the Southern 
Ocean is patchy and not comparable to observation. 
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Fig 19September mean sea ice extent from NSIDC Sea Ice Index (solid black line; Fetterer et al. 2016), and from ensemble 
members of the June forecast (grey lines). The ensemble mean is indicated with a black dashed line, and a bias-correction is 
indicated with a dotted line, in 10° km* 
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Fig 20September sea ice extent from each ensemble member minus the NSIDC Sea Ice Index observed value, plotted based on 
the initialization date. Colors indicate the year of the forecast 


The lessons learned from this initial seasonal forecasting system exercise are useful in the construction of new 
forecast systems. First, the contrast in the competitiveness of the system shown in Figures 17-18with the difficulties 
indicated in Figures 19-20suggest the continued experimental nature of sea-ice forecasts on these time scales, but 
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also that some significant improvement 1s relatively straightforward — for example, with improved atmospheric 
temperature forcing of the ocean analysis in the melt season such as in MERRA-2 (Bosilovich et al. 2016).The 
utility of hindcasts for the general characterization of the sea-ice forecast system suggests that the anomaly 
forecasting approach of NMME is also advantageous for the Arctic. But the simple bias correction approach that has 
been heretofore used is likely problematic. Bias correction of the hemisphere-summed ice extent is a means of 
addressing an inadequate representation of seasonality in the model, but it is mostly used here to address the 
mismatch of the land-sea mask between various models and observing system grids (see for example, Blanchard- 
Wrigglesworth et al. 2016).A better methodology for sea-ice intercomparison between models and observations 1s 
required. The issues shown here emphasize the role of initial climate conditions for improved forecasts. This 
includes an investigation for improving the representation of sea-ice characteristics in the analysis state. 


7 Conclusions 


In this study we provide the details of the GEOS-5 seasonal forecast system setup, which is used in particular to 
provide monthly contributions to the NMME project. The SST, T2M, precipitation and sea ice extent skills are 
documented for the comparison with other systems and to track the differences, hopefully, improvements, with the 
future system currently under development. Notable problems in the current system are the large SST drift in the 
northern Atlantic Ocean (strongest for the forecasts initialized in winter and spring), in the equatorial Pacific Ocean 
(strongest for the forecasts initialized in winter and spring), in the northern Pacific Ocean (for the forecasts 
initialized in late fall and winter) and in the Southern Ocean (for the forecasts initialized in austral winter).Anomaly 
correlation SST skill is poor in the western Pacific Ocean (Nifo 4 index) relative to the central and eastern 
equatorial regions (Nifio 3 and Nino 3.4 indices).This is related to the overextension of the warm water anomaly to 
the west during the El Nifio events. Ham et al (2014b) attributes this to the weak thermal damping and the erroneous 
zonal advective feedback as a response of the wind-driven current to the wind forcing. The strong ocean sensitivity 
to the wind forcing may be mitigated in the future by coupling the ocean with the new improved atmospheric 
analysis (MERRA-2), in which the surface winds are significantly improved compared to MERRA (Bosilovich et al, 
2016), in particular, over the western equatorial Pacific Ocean. 

The largest (negative) bias for T2M is over the winter Arctic Ocean related to the fact that GEOS-5 underestimates 
the sea ice temperature in winter. The largest positive T2M bias occurs during the northern hemisphere summer over 
land, especially in Asia and the west coast of North America.T2M is not significantly biased over the tropical 
oceans. The largest precipitation bias is found in the summer over land in the northern hemisphere, likely related to 
the uncertainty in model parameterization of vegetation cover, cloud physics etc. Anomaly correlation T2M skill is 
high for at least 6 lead months for all seasons over the tropical oceans, but drops quickly (after 1 month) over land. 
Significant precipitation skills in terms of anomaly correlation are found only over the equatorial eastern Pacific 
Ocean, where ENSO has a dominant influence on precipitation. Everywhere else the anomaly correlation drops to 
near zero after the first month of the forecast. 

The experimental sea ice forecast provides a benchmark for the future system evaluation. The current GEOS-5 
system is on par with other comparable models based on the 3 year comparison within the Sea Ice outlook project. 
Yet there are known shortcomings in the sea ice initialization and ocean and atmospheric feedbacks. In addition to 
providing better forcing to the ocean model via MERRA-? analysis, the sea ice forecast has a potential to benefit 
from assimilating new types of observations during the ocean and sea ice initialization procedure, such as sea 
surface height and ice thickness. 

The GEOS-5 seasonal forecast system has been in service since early 2012.Since its inception, new versions of the 
models have become available, and the new ensemble ocean and sea ice assimilation system that is capable of 
processing new data types has been put in place. As the next system is being developed, this paper will be among 
those providing reference for evaluating its performance. 
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